


Chapter 1 

Suggested Plan for Teaching the Course 



Each chapter is interactive. Students should fill in the blanks and answer the questions. 

At the end of each chapter is at least one practice. The practice leads the students step-by-step through 
problems. We, the authors, start the practices in class with students working in groups of 2, 3, or 4. The 
students finish the practices at home. The practice is after the chapter reading but before the homework. 

The back of the book contains answers to the odd-numbered homework problems. In this plan (this 
document), the suggested homework is listed at the end of the chapter discussion. 

At the end of each chapter (after the homework), there is at least one lab. The labs use real data collected 
by the instructor or the students or both. We often use the class to collect data. Labs may be done in groups 
and are an excellent teaching tool especially if they are started in class. The book contains the following 
labs: 

Ch. 1: Data Collection Lab I (number of movies viewed) 

Ch. 1: Sampling Experiment Lab II (table of restaurants provided) 

Ch. 2: Descriptive Statistics Lab (number of pairs of shoes) 

Ch. 3: Probability Lab (counting M&M's) 

Ch. 4: Discrete Distribution Lab I (picking playing cards) 

Ch. 4: Discrete Distribution Lab II (Tet game) 

Ch. 5: Continuous Distribution Lab (generate random numbers) 

Ch. 6: Normal Distribution Lab I (Terry Vogel's lap times provided) 

Ch. 6: Normal Distribution Lab II (measure pinkie fingers) 

Ch. 7: Central Limit Theorem Lab I (counting change) 

Ch. 7: Central Limit Theorem Lab II (cookie recipes) 

Ch. 8: Confidence Interval Lab I (real estate prices) 

Ch. 8: Confidence Interval Lab II (students born in state) 

Ch. 8: Confidence Interval Lab III (heights of women) 

Ch. 9: Hypothesis Testing Lab - Single Mean and Single Proportion (3 tests) 

Ch. 10: Hypothesis Testing Lab - Two Means and Two Proportions (3 tests) 

Ch. 11: Chi-Square Goodness of Fit Lab I (grocery store receipts) 

Ch. 11: Chi-Square Test for Independence Lab II (favorite snack/gender) 

Ch. 12: Regression Lab I (distance from school vs. cost of supplies this term) 

Ch. 12: Regression Lab II (number of pages in textbook vs. cost of textbook) 

Ch. 12: Regression Lab III (weights vs. fuel efficiency) 

Ch. 13: ANOVA Lab (fruits, vegetables, breads) 



1 This content is available online at <http://cnx.org/content/ml7214/1.6/>. 




Because the authors use technology heavily in the course (making many class periods a lab), we typically 
choose to do 6 labs during the quarter. The labs are best done in groups of 2, 3, or 4. 

There are five projects in the book. The Univariate Data project covers the ideas in chapters 1 and 2. 
The Continuous Distributions and Central Limit Theorem project covers ideas in chapters 5, 6, and 7. The 
Hypothesis Testing - Article and the Hypothesis Testing - Word projects cover ideas in chapters 8 and 9. 
The Bivariate Data, Linear Regression and Univariate project covers ideas in chapters 1, 2, and 12. Projects 
are done in groups of 2, 3, or 4. 

There are Practice Finals with answers and Data Sets in the text. One of the Chapter 6 Labs uses one of the 
data sets. Going over the Table of Contents for this collection with the students is recommended. 

We carry probabilities to 4 decimal places. 

The list below gives the number of days (a "day" is a 50-minute period) it takes to cover each chapter, 
based on a quarter system (10 weeks of class, 1 week of finals). At De Anza, we are on a quarter system. 
In a semester, you could spend more time analyzing real data. The material is meant to be covered in one 
quarter or in one semester. 

• Introduction - 2 days 

• Descriptive Statistics - 4 days 

• Probability Topics - 4 days 

• Discrete Random Variables - 5 days 

• Continuous Random Variables - 3 days 

• The Normal Distribution - 3 days 

• The Central Limit Theorem - 3 days 

• Confidence Intervals - 4 days 

• Hypothesis Testing - Single Mean and Single Proportion - 4 days 

• Hypothesis Testing - Two Means and Two Proportions - 4 days 

• The Chi-Square Distribution - 4 days 

• Linear Regression and Correlation - 4 days 

• Analysis of Variance and F Distribution - 3 days 



Chapter 2 

Ch. 1: Sampling and Data 



Explain the terms statistics and probability. 

Introduce the key terms by an example. 

Example 2.1 

Students may be interested in the average time (in years) it will take them to earn a B.A. or B.S. 
Differentiate between population and sample. 

Explain data. The book discusses qualitative and quantitative data. Quantitative data is either discrete 
(countable) or continuous (measurable). 

Types of Data 

• Qualitative data - the city or town a student lives in. 

• Quantitative discrete (countable) data - the number of T-shirts a student owns. 

• Quantitative continuous (measurable) data - the amount of time (in hours) a student studies statistics 
each day. 

Sampling 

Discuss what a sample is. Stress the importance of sampling randomly and the fact that two random 
samples from the same population may be different. Doing the two experiments with a fair die (roll the 
same die 20 times for each experiment and record the frequencies of the faces in the book) will help them 
understand how samples vary. Using your class as the population, sample 10 men and 10 women. Let the 
data be the number of pairs of shoes each student owns. This example illustrates samples from the same 
population that are not representative of it. 
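The two die experiments can also be simulated. The following is a minimal sketch, assuming Python's standard pseudo-random generator stands in for a fair die; the function name and the seeds are illustrative choices, not part of the book.

```python
import random
from collections import Counter

def roll_experiment(n_rolls, seed):
    """Roll a fair six-sided die n_rolls times and count each face."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    counts = Counter(rolls)
    # Report every face, including faces that never came up.
    return {face: counts.get(face, 0) for face in range(1, 7)}

experiment_1 = roll_experiment(20, seed=1)
experiment_2 = roll_experiment(20, seed=2)

print("Face  Exp. 1  Exp. 2")
for face in range(1, 7):
    print(f"{face:>4}  {experiment_1[face]:>6}  {experiment_2[face]:>6}")
```

Changing the seeds plays the role of repeating the experiment: the two frequency columns usually differ even though the die is the same.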

Discuss how to sample data. Though there are numerous ways, the book discusses simple random, stratified, 
cluster, systematic, and convenience sampling. You may want to discuss other ways of sampling. 
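A short sketch of four of these methods follows. The roster of 40 numbered students, the strata, and the cluster sizes are hypothetical and chosen only for illustration.

```python
import random

rng = random.Random(0)
population = list(range(1, 41))  # hypothetical roster: students numbered 1 to 40

# Simple random sample: every group of 8 students is equally likely.
simple_random = rng.sample(population, 8)

# Systematic sample: random starting point, then every 5th student.
start = rng.randrange(5)
systematic = population[start::5]

# Stratified sample: split into strata, then sample within each stratum.
strata = {"first half": population[:20], "second half": population[20:]}
stratified = {name: rng.sample(group, 4) for name, group in strata.items()}

# Cluster sample: split into clusters, then take every member of a few clusters.
clusters = [population[i:i + 10] for i in range(0, 40, 10)]
cluster_sample = [s for cluster in rng.sample(clusters, 2) for s in cluster]

print(simple_random, systematic, stratified, cluster_sample, sep="\n")
```

A convenience sample has no random step at all (for example, the first 8 students on the roster), which is why it is the least trustworthy of the five.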

Frequency 

The last part of the chapter discusses frequency, relative frequency, and cumulative relative frequency. The 
students should understand how to read the table in the example (heights, to the nearest inch, of male 
students at ABC College). 
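The three columns of such a table can be built as in the sketch below. The ten heights are made-up values, not the data from the book's example.

```python
from collections import Counter

heights = [66, 67, 67, 68, 68, 68, 69, 69, 70, 72]  # hypothetical data, in inches

def frequency_table(data):
    """Return rows of (value, frequency, relative frequency, cumulative relative frequency)."""
    counts = Counter(data)
    n = len(data)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        relative = counts[value] / n
        cumulative += relative  # running total of the relative frequencies
        rows.append((value, counts[value], relative, cumulative))
    return rows

table = frequency_table(heights)
for value, freq, rel, cum in table:
    print(f"{value:>3}  {freq:>2}  {rel:.2f}  {cum:.2f}")
```

The last cumulative relative frequency is always 1 (up to rounding), which gives students a quick check on their own tables.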

Assign Practice 

Take some class time and have the students work in groups and complete the Practice 2 . 



1 This content is available online at <http://cnx.Org/content/ml6130/l.10/>. 
2 "Sampling and Data: Practice 1" <http://cnx.org/content/ml6016/latest/> 






Assign Homework 

Assign Homework 3 . Suggested problems: 1-17 odds, 19 - 27. 



3 "Sampling and Data: Homework" <http://cnx.org/content/ml6010/latest/> 



Chapter 3 

Ch. 2: Descriptive Statistics 



Graphs are important tools in statistics and probability. Graphs used in this course are the boxplot, the 
histogram, and the stem-plot. The histogram and boxplot are used extensively while the stem-plot is just 
demonstrated. 

Illustrate Examples 

• To illustrate stem-plots, have the students complete Example 2-2 by hand. 

• To illustrate histograms, have the students do Example 2-4 by hand and then, if you are using technology, 
have them do the same example. They can verify their results by looking at the picture. 

• Right after Example 2-4, there is an "Optional Collaborative Classroom Exercise" for the students to 
do that involves the amount of money they have in their pocket or purse. 

• To illustrate the boxplot, have the students do Example 2-6. In this example, they will compare two 
boxplots. 

Center of Data 

Discuss the measures of "center" - mean (average), median, mode. If you are using technology, it helps 
to show the students how to use technology to find the measures first. Then do some examples by hand. 
Distinguish between the symbols used for the sample mean and the population mean. Give an example 
where the mean is the best measure of the center and a second example where the median is the best 
measure. (Example where median is the better measure: 19, 16, 46, 18, 21. Example where mean is the 
better measure: 18, 20, 23, 25, 25.) At the end of the chapter, there is a summary of the mean formulas if you 
desire to go over them. 
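The two example data sets can be checked quickly with technology. This sketch uses Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

with_outlier = [19, 16, 46, 18, 21]     # 46 pulls the mean upward
without_outlier = [18, 20, 23, 25, 25]  # no extreme values

for name, data in [("with outlier", with_outlier),
                   ("without outlier", without_outlier)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

In the first set the mean (24) is larger than four of the five values, so the median (19) describes the center better. In the second set the mean (22.2) and median (23) are close together.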

Spread of Data 

Discuss the measures of spread - variance and standard deviation. Stress that the standard deviation is 
the square root of the variance. Differentiate between the sample and population standard deviations. 
Dividing by n - 1 in the sample variance formula makes the sample standard deviation a better estimator 
of the population standard deviation. Do one example by hand and have the students participate (the set 
{1, 2, 3} is quick and easy). They will have to calculate the mean first. They should discover how easy it is 
to make a numerical error when they calculate standard deviation by hand. 
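The by-hand calculation for the set {1, 2, 3} can be mirrored step by step, as in the sketch below, and then compared with the built-in functions.

```python
import math
import statistics

data = [1, 2, 3]
n = len(data)

mean = sum(data) / n                                  # step 1: the mean
squared_deviations = [(x - mean) ** 2 for x in data]  # step 2: squared deviations

sample_variance = sum(squared_deviations) / (n - 1)   # divide by n - 1 for a sample
population_variance = sum(squared_deviations) / n     # divide by n for a population

sample_sd = math.sqrt(sample_variance)                # standard deviation = sqrt(variance)
population_sd = math.sqrt(population_variance)

print(f"sample sd = {sample_sd:.4f}, population sd = {population_sd:.4f}")
```

For this set the sample standard deviation is exactly 1 and the population standard deviation is about 0.8165.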

Location of Data 

Discuss the measures of location - quartile and percentile. For many students, these measures are difficult. 
It is better to make up a relative frequency table from an example like the one in the book (the amount of 
sleep 50 students get per school night) and find quartiles and percentiles. Graphing calculators typically 
calculate quartiles. 



1 This content is available online at <http://cnx.Org/content/ml6802/l.10/>. 




Definition of Value 

We introduce the formula Value = Mean + (#ofSTDEVs)(Standard Deviation) in this chapter. For example, 
a student with a 74 on the first exam in a statistics class wants to compare his score to a student who received 
a 70 in another section. If the mean and standard deviation for the first class were 72 and 4, respectively, and 
the mean and standard deviation for the second class were 68 and 2, respectively, which student did better 
relative to the class? Solve the equation for #ofSTDEVs in each case. 
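Solving Value = Mean + (#ofSTDEVs)(Standard Deviation) for #ofSTDEVs gives (Value - Mean) / (Standard Deviation). A minimal sketch of the comparison, with an illustrative function name:

```python
def number_of_stdevs(value, mean, standard_deviation):
    """Solve Value = Mean + (#ofSTDEVs)(Standard Deviation) for #ofSTDEVs."""
    return (value - mean) / standard_deviation

first_student = number_of_stdevs(74, mean=72, standard_deviation=4)
second_student = number_of_stdevs(70, mean=68, standard_deviation=2)

print(first_student, second_student)
```

The first student is 0.5 standard deviations above the class mean and the second is 1 standard deviation above, so the second student did better relative to the class.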

Assign Practice 

Have students work in groups to complete Practice 1 2 and Practice 2 3 . 

Calculator Instructions 

If you are using the TI-83 or TI-84 calculator series, go over the calculator instructions in the text for 
entering data and calculating the sample mean, the sample standard deviation, the quartiles, constructing 
histograms, and constructing boxplots. The calculator instructions can also be found on the Texas 
Instruments website and the appropriate Guidebook. 

Assign Homework 

Assign Homework 4 . Suggested problems: 1-23 odds, 24 - 30. 



2 "Descriptive Statistics: Practice 1" <http://cnx.org/content/ml6312/latest/> 
3 "Descriptive Statistics: Practice 2" <http://cnx.org/content/ml7105/latest/> 
4 "Descriptive Statistics: Homework" <http://cnx.org/content/ml6801/latest/> 



Chapter 4 

Ch. 3: Probability Topics 



The best way to introduce the terms is through examples. You can introduce the terms experiment, 
outcome, sample space, event, probability, equally likely, conditional, mutually exclusive events, and 
independent events, as well as the addition rule and the multiplication rule, with the following example: 
In a box (you cannot see into it), there are 4 red cards numbered 1, 2, 3, 4 and 9 green cards numbered 1, 
2, 3, 4, 5, 6, 7, 8, 9. You randomly draw one card (experiment). Let R be the event the card is red. Let G be 
the event the card is green. Let E be the event the card has an even number on it. 

Example 4.1 

Event Card Example 

1. List all possible outcomes (the sample space). Have students list the sample space in the 
form {R1, R2, R3, R4, G1, G2, G3, G4, G5, G6, G7, G8, G9}. Each outcome is equally likely. 
P(one outcome) = 1/13. 

2. Find P (R) . 

3. Find P (G) . G is the complement of R. P (G) + P (R) = . 



4. P (red card given that the card has an even number on it) = P (R | E). This is a conditional. 
Pick the red card out of the even cards. There are 6 even cards. 

5. Find P (R AND E). (Multiplication Rule: P (R AND E) = P (E | R) P (R)) 

6. Find P (R OR E). (Addition Rule: P (R OR E) = P (E) + P (R) - P (E AND R)) 

7. Are the events R and G mutually exclusive? Why or why not? 

8. Are the events G and E independent? Why or why not? 
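The answers to the card questions can be verified by listing the sample space and counting. This is a sketch only; the helper names are illustrative, and exact fractions are used so the results match hand calculations.

```python
from fractions import Fraction

# Sample space: 4 red cards (1 to 4) and 9 green cards (1 to 9).
sample_space = [("R", n) for n in range(1, 5)] + [("G", n) for n in range(1, 10)]

def prob(event):
    """Probability of an event when every outcome is equally likely."""
    favorable = [card for card in sample_space if event(card)]
    return Fraction(len(favorable), len(sample_space))

def is_red(card):
    return card[0] == "R"

def is_green(card):
    return card[0] == "G"

def is_even(card):
    return card[1] % 2 == 0

p_r = prob(is_red)                                        # 4/13
p_g = prob(is_green)                                      # 9/13
p_e = prob(is_even)                                       # 6/13
p_r_and_e = prob(lambda c: is_red(c) and is_even(c))      # 2/13
p_r_given_e = p_r_and_e / p_e                             # 1/3
p_r_or_e = p_r + p_e - p_r_and_e                          # 8/13 (addition rule)

# R and G are mutually exclusive if they share no outcomes.
mutually_exclusive = prob(lambda c: is_red(c) and is_green(c)) == 0

# G and E are independent only if P(G AND E) = P(G) P(E).
p_g_and_e = prob(lambda c: is_green(c) and is_even(c))    # 4/13
independent = p_g_and_e == p_g * p_e

print(p_r, p_g, p_r_given_e, p_r_and_e, p_r_or_e, mutually_exclusive, independent)
```

R and G are mutually exclusive, while G and E are not independent because 4/13 is not equal to (9/13)(6/13).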

Example 4.2 

(Optional Topic) A Venn diagram is a tool that helps to simplify probability problems. Introduce 
a Venn diagram using an example. Example: Suppose 40% of the students at ABC College belong 
to a club and 50% of the student body work part time. Five percent of the student body works part 
time and belongs to a club. 

Have the students work in groups to draw an appropriate Venn diagram after you have shown 
them what a Venn diagram basically looks like. The diagram should consist of a rectangle with 
two overlapping circles. One circle represents the students who belong to a club (40%) and the 
other circle represents those students who work part time (50%). The overlapping part represents those 
students who belong to a club and who work part time (5%). 

Find the following: 

1. P(student works part time but does not belong to a club) 



1 This content is available online at <http://cnx.Org/content/ml6844/l.ll/>. 






2. P(student belongs to a club given that the student works part time) 

3. P(student does not belong to a club) 

4. P(works part time given that the student belongs to a club) 

5. P(student belongs to a club or the student works part time) 

Solution 




Figure 4.1 



C - student belongs to a club 
PT - student works part time 
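The five Venn diagram answers follow from the three given percentages. The sketch below uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

p_c = Fraction(40, 100)     # belongs to a club
p_pt = Fraction(50, 100)    # works part time
p_both = Fraction(5, 100)   # overlap of the two circles

p_pt_not_c = p_pt - p_both          # 1. part time but not in a club
p_c_given_pt = p_both / p_pt        # 2. club given part time
p_not_c = 1 - p_c                   # 3. not in a club
p_pt_given_c = p_both / p_c         # 4. part time given club
p_c_or_pt = p_c + p_pt - p_both     # 5. club or part time (addition rule)

for label, p in [("PT and not C", p_pt_not_c), ("C | PT", p_c_given_pt),
                 ("not C", p_not_c), ("PT | C", p_pt_given_c),
                 ("C or PT", p_c_or_pt)]:
    print(f"P({label}) = {float(p):.4f}")
```

The results are 0.45, 0.10, 0.60, 0.125, and 0.85, in that order.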



Example 4.3 



Find the following: 

1. P(a child is 9 - 11 years old) 

2. P(a child prefers regular soccer camp) 

3. P(a child is 9 - 11 years old and prefers regular soccer camp) 

4. P(a child is 9 - 11 years old or prefers regular soccer camp) 

5. P(a child is over 14 given that the child prefers micro soccer camp) 

6. P(a child prefers micro soccer camp given that the child is over 14) 



Tree Diagrams (Optional Topic) 

A tree is another probability tool. Many probability problems are simplified by a tree diagram. To exemplify 
this, suppose you want to draw two cards, one at a time, without replacement from the box of 4 red cards 
and 9 green cards. 




[Tree diagram: the 1st draw branches to R or G, and the 2nd draw branches again, giving 12 RR, 36 RG, 
36 GR, and 72 GG outcomes.] 

Figure 4.2: There are (13)(12) = 156 Possible Outcomes (e.g., R1R2, R1G3, G3G4, etc.) 



Example 4.4 

Find the following: 

1. P(RR) 

2. P(RG OR GR) 

3. P(at most one G in two draws) 

4. P(G on the 2nd draw | R on the 1st draw). The size of the sample space has been reduced to 
12 + 36 = 48. 

5. P(no R on the 1st draw) 
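The tree counts can be confirmed by listing all ordered pairs of distinct cards. This is a sketch that assumes the same box of 4 red and 9 green cards; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

cards = [f"R{n}" for n in range(1, 5)] + [f"G{n}" for n in range(1, 10)]

# Ordered pairs with no card repeated model two draws without replacement.
outcomes = list(permutations(cards, 2))  # 13 * 12 = 156 outcomes

def colors(outcome):
    """Return the color pattern of a two-card outcome, e.g. 'RG'."""
    return outcome[0][0] + outcome[1][0]

def prob(event, space=outcomes):
    return Fraction(sum(1 for o in space if event(o)), len(space))

p_rr = prob(lambda o: colors(o) == "RR")                 # 12/156 = 1/13
p_rg_or_gr = prob(lambda o: colors(o) in ("RG", "GR"))   # 72/156 = 6/13
p_at_most_one_g = prob(lambda o: colors(o) != "GG")      # 84/156 = 7/13

# Conditioning on R first shrinks the sample space to 12 + 36 = 48 outcomes.
red_first = [o for o in outcomes if o[0][0] == "R"]
p_g2_given_r1 = prob(lambda o: o[1][0] == "G", red_first)  # 36/48 = 3/4

p_no_r_first = prob(lambda o: o[0][0] == "G")            # 108/156 = 9/13

print(len(outcomes), p_rr, p_rg_or_gr, p_at_most_one_g, p_g2_given_r1, p_no_r_first)
```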



Introduce contingency tables as another tool to calculate probabilities. Let's suppose an owner of a soccer 
camp for children keeps information concerning the type of soccer camp the children prefer and their ages. 
The data is for 572 children. 



Type of Soccer Camp Preference   Under 6   6-8   9-11   12-14   Over 14   Row Total 
Micro                                 42    76     46      25        10         199 
Regular                                8    68     92     105       100         373 
Column Total                          50   144    138     130       110         572 

Table 4.1 
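The soccer camp probabilities in Example 4.3 can be read from Table 4.1 by dividing the appropriate cell or total by the appropriate denominator. A sketch, with illustrative helper names:

```python
from fractions import Fraction

ages = ["Under 6", "6-8", "9-11", "12-14", "Over 14"]
table = {
    "Micro":   [42, 76, 46, 25, 10],
    "Regular": [8, 68, 92, 105, 100],
}
total = sum(sum(row) for row in table.values())  # 572 children

def cell(camp, age):
    return table[camp][ages.index(age)]

def column_total(age):
    return sum(row[ages.index(age)] for row in table.values())

def row_total(camp):
    return sum(table[camp])

p_9_11 = Fraction(column_total("9-11"), total)            # 138/572
p_regular = Fraction(row_total("Regular"), total)         # 373/572
p_both = Fraction(cell("Regular", "9-11"), total)         # 92/572
p_either = p_9_11 + p_regular - p_both                    # 419/572

# Conditionals use a row or column total as the denominator.
p_over14_given_micro = Fraction(cell("Micro", "Over 14"), row_total("Micro"))        # 10/199
p_micro_given_over14 = Fraction(cell("Micro", "Over 14"), column_total("Over 14"))   # 10/110

print(p_9_11, p_regular, p_both, p_either, p_over14_given_micro, p_micro_given_over14)
```

The last two answers differ (10/199 versus 10/110) because the condition changes which total is the denominator.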

Assign Practice 

Assign Practice 1 2 and Practice 2 3 in class. Have students work in groups. 

Assign Lab 

The Probability Lab is an excellent way to cement many of the ideas of probability. The lab is a group effort 
(3-4 students per group). 

Assign Homework 

Assign Homework 4 . Suggested problems: 1-15 odds, 19, 20, 21, 23, 27, 28 - 30. 

2 "Probability Topics: Practice" <http://cnx.org/content/ml6839/latest/> 
3 "Probability Topics: Practice II" <http://cnx.org/content/ml6840/latest/> 
4 "Probability Topics: Homework" <http://cnx.org/content/ml6836/latest/> 
Chapter 5 

Ch. 4: Discrete Random Variables 



This chapter introduces expected value (long term average) and four of the common discrete random 
variables (binomial, geometric, hypergeometric, and Poisson). The authors cover expected value and two 
of the discrete random variables (binomial and Poisson). Depending on your background, you may want 
to cover the binomial (usually required) together with none or some of the other discrete random variables. 

Random Variables 

Explain random variable (assigns numerical values to the outcomes of a statistical experiment). Upper case 
letters denote random variables. Example: Let X = the number of cars in your household. (The phrase "the 
number of" tells you that X takes on discrete values.) X takes on the values 0, 1, 2, 3, ... 

The Probability Distribution Function 

A probability distribution function (pdf) is best shown with an example: A controversial drug is given to 
two patients. Let X = the number of patients cured. 

• P(a cure) = 5/6 

• P(no cure) = 1/6 

A pdf is easiest to understand in a table. 

x    P(X = x) 
0    P(X = 0) = (1/6)(1/6) = 1/36 
1    P(X = 1) = 2(5/6)(1/6) = 10/36 
2    P(X = 2) = (5/6)(5/6) = 25/36 

Table 5.1: Each probability is between 0 and 1. 

The previous example can be used as an example of expected value or long term average (μ). Make a third 
column labeled (x) (P (x)). Calculate the three values and add them. The result, (0) (1/36) + (1) (10/36) + 
(2) (25/36) = 60/36 = 1.67, is the expected number of patients who are cured if the drug is administered many 
times to two patients. 
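The third-column calculation can be sketched as follows, assuming P(a cure) = 5/6 for each patient. That value is an assumption here, chosen because it is the one consistent with the expected value of 1.67 stated above.

```python
from fractions import Fraction

p_cure = Fraction(5, 6)   # assumed per-patient cure probability
p_no_cure = 1 - p_cure

# pdf of X = number of patients cured out of two independent patients.
pdf = {
    0: p_no_cure * p_no_cure,
    1: 2 * p_cure * p_no_cure,   # either patient could be the one cured
    2: p_cure * p_cure,
}

# Third column: x * P(x); the expected value is the column total.
expected_value = sum(x * p for x, p in pdf.items())

print(pdf, expected_value, round(float(expected_value), 2))
```

The probabilities sum to 1, and the expected value is 60/36 = 5/3, or about 1.67.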

The binomial is a special discrete pdf or pattern. A binomial experiment consists of counting the number 
of successes in one or more Bernoulli trials. (A Bernoulli trial has only two possible outcomes, success or 
failure. In every Bernoulli trial, the probability of a success (or failure) remains the same.) 



1 This content is available online at <http://cnx.Org/content/ml6834/l.13/>. 

Example 5.1 

John comes to his stat class and discovers he must take a true-false quiz. There are 20 questions 
on the quiz. John has not attended class recently and must guess randomly at the questions. Let 
X = the number of questions John answers correctly out of 20 questions. X takes on the values 
0, 1, 2, 3, ..., 20. P (correct answer: a success) = 0.5. John's guessing at the answers is a binomial 
experiment. 

Notation: X ~ B (20, 0.5) where the number of trials, n, is 20 and the probability of a success, p, 
on any trial is 0.5. 

Students can find the mean (μ = np) and the standard deviation (σ = square root of npq) either 
by hand or with technology. (q is the probability of a failure.) Have students help you fill in the 
blanks and answer the questions: 

1. σ = ____ 

2. Draw the graph. (The horizontal axis is the number of successes; the vertical axis is the probability of 
0 successes, 1 success, 2 successes, ..., 20 successes.) Draw vertical lines or boxes. 

3. • What is the probability that John gets 15 questions correct? P (X = 15) 

• More than 15 questions correct? P (X > 15) 

• At least 15 questions correct? (P (X = 15) + P (X > 15)) 
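For instructors without a calculator at hand, the quiz probabilities can be sketched in Python from the binomial formula. The function name `binom_pmf` is illustrative.

```python
from math import comb, sqrt

n, p = 20, 0.5
q = 1 - p  # probability of a failure

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

mu = n * p               # mean = np
sigma = sqrt(n * p * q)  # standard deviation = sqrt(npq)

p_eq_15 = binom_pmf(15, n, p)                                # exactly 15 correct
p_gt_15 = sum(binom_pmf(x, n, p) for x in range(16, n + 1))  # more than 15
p_ge_15 = p_eq_15 + p_gt_15                                  # at least 15

print(f"mu = {mu}, sigma = {sigma:.4f}")
print(f"P(X = 15) = {p_eq_15:.4f}, P(X > 15) = {p_gt_15:.4f}, P(X >= 15) = {p_ge_15:.4f}")
```

To four decimal places the three probabilities are 0.0148, 0.0059, and 0.0207, with μ = 10 and σ about 2.2361.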



A geometric experiment takes place when at least one Bernoulli trial is performed and all are failures 
except the last one which is the only success. Example: Liz likes to play darts. The probability that she hits 
the bull's eye (success) on any throw is 85%. (Liz is good!) Liz throws darts at the bull's eye until she hits 
it. Let X = the number of times Liz throws the dart at the bull's eye until she hits it. Have students help you 
fill in the blanks: 

• X ~ ____ (X ~ G (p) where p = probability of a success = 0.85) 

• Draw the graph. (Number of throws until the first success versus probability) 

• What is the probability that Liz hits the bull's eye for the first time on the third throw? That it 
takes more than three throws for Liz to hit the bull's eye for the first time? That it takes at least three 
throws? 

• X takes on the values ____. 

• μ = ____. In words, μ is ____. 



The Geometric Equation 

$P(X = x) = q^{x-1} \, p$ 
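Applied to the dart example with p = 0.85, the geometric equation gives the values sketched below. The function name is illustrative; the tail probabilities use the fact that "more than x throws" means the first x throws all miss.

```python
p = 0.85   # probability Liz hits the bull's eye on any throw
q = 1 - p  # probability of a miss

def geometric_pmf(x, p):
    """P(X = x): x - 1 failures followed by the first success."""
    return (1 - p) ** (x - 1) * p

p_first_hit_on_third = geometric_pmf(3, p)  # q^2 * p
p_more_than_three = q ** 3                  # first three throws all miss
p_at_least_three = q ** 2                   # first two throws both miss
mean_throws = 1 / p                         # mean of the geometric distribution

print(f"P(X = 3) = {p_first_hit_on_third:.4f}")
print(f"P(X > 3) = {p_more_than_three:.4f}")
print(f"P(X >= 3) = {p_at_least_three:.4f}")
print(f"mean = {mean_throws:.4f}")
```

To four decimal places these are 0.0191, 0.0034, 0.0225, and a mean of 1.1765 throws.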

Hypergeometric Distribution 

The hypergeometric distribution is characterized by choosing a sample without replacement from two 
distinct groups. One of the two groups is what is of interest in the sample. Some lotteries are based on the 
hypergeometric distribution. 

Example 5.2 

Suppose a shipment of 20 tape recorders contains 5 defectives. An inspector randomly chooses 8 
of the tape recorders to inspect. He is interested in the number of defectives in the sample of 8. 
Have the class answer questions similar to those for the binomial and the geometric. 

NOTATION: X ~ H (r, b, n) where r = size of the group of interest, b = size of the other group, and 
n = size of the sample. 
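For the tape recorder example, X ~ H(5, 15, 8). The sketch below uses the standard counting formula for the hypergeometric pdf, which is not written out in this guide; the function name is illustrative.

```python
from math import comb

r, b, n = 5, 15, 8  # defectives, non-defectives, sample size

def hypergeom_pmf(x, r, b, n):
    """P(X = x): choose x from the group of interest and n - x from the other group."""
    return comb(r, x) * comb(b, n - x) / comb(r + b, n)

pmf = {x: hypergeom_pmf(x, r, b, n) for x in range(0, min(r, n) + 1)}
mean_defectives = sum(x * prob for x, prob in pmf.items())

for x, prob in pmf.items():
    print(f"P(X = {x}) = {prob:.4f}")
print(f"mean = {mean_defectives:.4f}")
```

The mean works out to n·r/(r + b) = 8(5)/20 = 2 defectives per sample of 8.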
Poisson Distribution 

The Poisson distribution is concerned with the number of times an event takes place in a certain interval. It 
is used in the field of reliability. The Poisson approximates the binomial when n is "large" (say, more than 
100) and p is "small" (say, less than 0.1). 

Example 5.3 

Suppose the average number of accidents that occur in a week at a particularly busy intersection 
is one. The interval is one week. The average is one accident. Let X = the number of accidents 
that occur in a one week period at the intersection. Have the students help fill in the blanks and 
answer the questions: 

1. X ~ ____ (X ~ P (μ) where μ = one accident) 

2. What values does X take on? 

3. What is the probability that at most one accident occurs in a week? 

The Poisson Distribution Formula 

The parameter for the Poisson is the mean, μ. Some books and calculators use the Greek letter λ (lambda) 
for the mean. The equation for the Poisson is: 

$P(X = x) = \frac{\mu^{x} e^{-\mu}}{x!}$ where x = 0, 1, 2, 3, ... (5.1) 
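With a mean of one accident per week, the probability of at most one accident can be sketched directly from the Poisson formula. The function name is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return mu**x * exp(-mu) / factorial(x)

mu = 1  # one accident per week on average

# "At most one" means X = 0 or X = 1.
p_at_most_one = poisson_pmf(0, mu) + poisson_pmf(1, mu)

print(f"P(X <= 1) = {p_at_most_one:.4f}")
```

The result is 2/e, or about 0.7358.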

Assign Practice 

Have the students complete the portion of the practice that is appropriate for what you have covered in 
class. Expected Value, Binomial, and Poisson are dealt with in Practice 1 2 , Practice 2 3 , and Practice 3 4 . Practice 
4 5 is based on the Geometric Distribution, while Practice 5 6 is focused on reviewing the Hypergeometric 
Distribution. 

Calculator Instructions 

If you are using the TI-83/TI-84 series, there are probability functions for the binomial, Poisson, and 
geometric. Each has a pdf and a cdf (for example binompdf and binomcdf). These functions are located in 
2nd DISTR. If you use, say, binompdf (n, p), you will get the table of probabilities for 0, 1, 2, ..., n. If you 
use binompdf (n, p, x), you will get the probability of x. If you use binomcdf (n, p, x), you will get the 
cumulative probability (P (X = 0) + P (X = 1) + P (X = 2) + ... + P (X = x)). 

Assign Homework 

Assign Homework 7 . Suggested homework: 1-17 odds, 23, 33 - 37 (Binomial and Poisson). 



2 "Discrete Random Variables: Practice 1: Discrete Distributions" <http://cnx.org/content/ml6830/latest/> 
3 "Discrete Random Variables: Practice 2: Binomial Distribution" <http://cnx.org/content/ml7107/latest/> 
4 "Discrete Random Variables: Practice 3: Poisson Distribution" <http://cnx.org/content/ml7109/latest/> 
5 "Discrete Random Variables: Practice 4: Geometric Distribution" <http://cnx.org/content/ml7108/latest/> 
6 "Discrete Random Variables: Practice 5: Hypergeometric Distribution" <http://cnx.org/content/ml7106/latest/> 



7 "Discrete Random Variables: Homework" <http://cnx.org/content/ml6823/latest/> 
Chapter 6 

Ch. 5: Continuous Random Variables 



This chapter is a good introduction to continuous types of probability distributions (the most famous of all 
is the normal). Two continuous distributions are covered - the uniform (or rectangular) and the exponential. 
For the uniform, probability is just the area of a rectangle. This distribution easily gets across the concept 
that probability is equal to area under a "curve" (a function). The exponential, which is used in industry 
and models decay, is a nice lead-in to the normal. The uniform and exponential distributions are also nice 
distributions to start with when you teach the Central Limit Theorem. It is interesting to note that the 
amount of money spent in one trip to the supermarket follows an exponential distribution. Several of our 
students discovered this idea when they chose data for their second project. 

Compare Binomial v. Continuous Distribution 

Begin this chapter by a comparison of a binomial (discrete) distribution and a continuous distribution. 
Using the normal for this comparison works well because the students are already familiar with it. The 
binomial graph has probability = height and the normal graph has probability = area. Tell the students that 
the discovery of probability = area in the continuous graph comes from calculus (which most of them have 
not studied). Draw the two graphs to make these ideas clear. 

Introduce Uniform Distribution 

Introduce the uniform distribution using the following example: The amount of time a student waits in line 
at the college cafeteria is uniformly distributed in the interval from 0 to 5 minutes (the students must wait 
in line from 0 to 5 minutes - each time in this interval is equally likely). Note: all the times cannot be listed. 
This is different from the discrete distributions. 

Example 6.1 

Let X = the amount of time (in minutes) a student waits in line at the college cafeteria. The notation 
for the distribution is X ~ U (a, b) where a = 0 and b = 5. The function is f (x) = 1/5 where 
0 < x < 5. The pattern is f (x) = 1/(b - a) where a < x < b. 

In this example a = 0 and b = 5. The function f (x) where 0 < x < 5 graphs as a horizontal line 
segment. 



1 This content is available online at <http://cnx.Org/content/ml6814/l.12/>. 



Figure 6.1: Because 0 < x < 5, the maximum area = (1/5)(5) = 1, the largest probability possible. 



Example 6.2 

Find the probability that a student must wait less than 3 minutes. Draw the picture and write the 
probability statement. 

Solution 




Figure 6.2: Probability statement: P(X < 3) = (3 - 0)(1/5) = 3/5. 



The probability is the shaded area (the area of a rectangle with base = 3 - 0 = 3 and 
height = 1/5). The probability a student must wait in the cafeteria line less than 3 minutes is 3/5. 



Example 6.3 

Find the average wait time. 

Solution 

μ = (a + b)/2 = (0 + 5)/2 = 2.5 minutes 



If the students take the time to draw the picture and write the probability statement, the problem becomes 
much easier. 
Example 6.4 

Find the 75th percentile of waiting times. A time is being asked for here. Percentiles often confuse 
students. They see "75th" and think they need to find a probability. Draw a picture and write a 
probability statement. Let k = the 75th percentile. 

Solution 





Figure 6.3 



• Probability statement: P (X < k) = 0.75 

• Area: (k - 0)(1/5) = 0.75, so k = 3.75 minutes 

75% of the students wait at most 3.75 minutes and 25% of the students wait at least 3.75 minutes. 
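The three uniform calculations so far (an area, the mean, and a percentile) can be sketched with nothing more than rectangle arithmetic, assuming the cafeteria wait is uniform on the interval from a = 0 to b = 5 minutes:

```python
a, b = 0, 5           # uniform distribution on the interval from 0 to 5 minutes
height = 1 / (b - a)  # f(x) = 1/(b - a) = 1/5

# P(X < 3): area of a rectangle with base 3 - a and height 1/5.
p_less_than_3 = (3 - a) * height

# Mean wait time: midpoint of the interval.
mean_wait = (a + b) / 2

# 75th percentile k: solve (k - a) * height = 0.75 for k.
k = a + 0.75 / height

print(f"P(X < 3) = {p_less_than_3:.2f}, mean = {mean_wait}, 75th percentile = {k:.2f}")
```

Note the direction of the two problems: the first starts with a time and returns a probability, while the percentile starts with a probability and returns a time.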



Example 6.5 

You can finish the uniform with a conditional. This reviews conditionals from Continuous Random 
Variables 2 . What is the probability that a student waits more than 4 minutes when he/she 
has already waited more than 3 minutes? 

Solution 



Algebraically: $P(X > 4 \mid X > 3) = \frac{P(X > 4 \text{ AND } X > 3)}{P(X > 3)} = \frac{P(X > 4)}{P(X > 3)} = \frac{1/5}{2/5} = \frac{1}{2}$ 



NOTE: The students see it more clearly if you do the problem graphically. The lower value, a, 
changes from 0 to 3. The upper value stays the same (b = 5). The function changes to: 
f (x) = 1/(5 - 3) = 1/2 



2 "Continuous Random Variables: Introduction" <http://cnx.org/content/ml6808/latest/> 




Figure 6.4: P (X > 4 | X > 3) = (base) (height) = (5 - 4)(1/2) = 1/2 



Introduce the Change Example 

The exponential distribution is generally concerned with how a quantity declines or decays. Examples 
include the life of a car battery, the life of a light bulb, the length of time business long distance telephone 
calls last, and the amount of change a person is carrying. You can introduce the exponential by using the 
change example. Ask everyone in your classroom to count their change and record it. Then have them 
calculate the mean and standard deviation and graph the histogram. The histogram should appear to 
be declining. Let X = the amount of change one person carries. Notation: X ~ Exp (m) where m is the 
parameter that controls the amount of decline or decay; m = 1/μ and μ = 1/m. Also, μ = σ. (In the example, 
the calculated mean and standard deviation ought to be fairly close.) 

Example 6.6 

The function is f (x) = m e^(-mx) where m > 0 AND x > 0. Find the probability that the amount of 
change one person has is less than $0.50. Draw the graph. 


Figure 6.5: The right tail extends indefinitely. There is no upper limit in x. 



The formula is P (X < x) = 1 - e^(-mx). P (X < 0.50) = ____ 

The authors use technology to solve the probability problems. If you use the TI-83/84 calculator series, 
enter on the home-screen 1 - e^(-m*0.50). Fill in the m with whatever the data produces (m = 1/μ; replace 
μ with the sample mean). 
Ask the question, "Ninety percent of you have less than what amount?" and have them find the 
90th percentile. 

Draw the picture and let k = the 90th percentile. P (X < k) = 0.90. Solve the equation 1 - e^(-mk) = 
0.90 for k. On the home-screen of the TI-83/TI-84, enter ln(1 - 0.90)/(-m). 

NOTE: Have students fill in the blanks. 

On average, a student would expect to have ____. The word "expect" implies the mean. Ten 
students together would expect to have ____ (the mean multiplied by 10). 
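The change example can be worked in software once the class data is in. In the sketch below the sample mean of $0.88 is a made-up placeholder; replace it with the mean your class actually records.

```python
from math import exp, log

sample_mean = 0.88   # hypothetical class average, in dollars
m = 1 / sample_mean  # decay parameter, m = 1/mu

# P(X < 0.50) = 1 - e^(-m * 0.50)
p_less_than_50_cents = 1 - exp(-m * 0.50)

# 90th percentile: solve 1 - e^(-m k) = 0.90 for k.
k = log(1 - 0.90) / (-m)

# Expected total change for ten students: the mean multiplied by 10.
expected_for_ten = 10 * sample_mean

print(f"P(X < 0.50) = {p_less_than_50_cents:.4f}")
print(f"90th percentile = ${k:.2f}")
print(f"ten students expect ${expected_for_ten:.2f} in total")
```

Substituting k back into 1 - e^(-mk) should return 0.90, which is a useful check for students.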

Assign Practice 

Assign the Practice 1 3 and Practice 2 4 in class to be done in groups. 

Assign Homework 

Assign Homework 5 . Suggested problems: 1-13 odds, 15 - 20. 



3 "Continuous Random Variables: Practice 1" <http://cnx.org/content/ml6812/latest/> 
4 "Continuous Random Variables: Practice 2" <http://cnx.org/content/ml6811/latest/> 
5 "Continuous Random Variables: Homework" <http://cnx.org/content/ml6807/latest/> 
Chapter 7 

Ch. 6: Normal Distribution 



A fair number of students are familiar with the "bell-shaped" curve. Stress that the normal is a continuous 
distribution like the uniform and exponential. However, the left and right tails extend indefinitely but come 
infinitely close to the x-axis. It is not necessary to show the probability distribution function for the normal 
(it is in the book) because there are normal probability tables and technology available for probability and 
percentile calculations. 

Visualize the Data 

Draw a picture of the normal graph and explain that it is symmetrical about the mean. The shape of the 
graph depends on the standard deviation. The smaller the standard deviation, the skinnier and taller the 
graph. A change in the mean shifts the graph to the right or left. The notation for the normal is X ~ N (μ, σ). 
Draw several normal curves (superimposed upon each other). Have students determine how the means 
and standard deviations are changing. 




Figure 7.1 



1 This content is available online at <http://cnx.Org/content/ml6990/l.9/>. 




Figure 7.2 



The Normal Distribution Notation 

The standard normal distribution is of special interest. Notation: Z ~ N (0, 1) where Z = one z-score (the 
number of standard deviations a value is to the right or left of the mean). The mean is 0 and the variance 
(and standard deviation) is 1. Any normal distribution can be standardized to the standard normal by the 
z-score formula: z = (x - μ)/σ. Do an example showing the standardization. For X ~ N (3, 2) and Y ~ N (5, 6), 
the values x = 4 and y = 8 are each 1/2 standard deviation to the right (0.5σ) of their respective means. 
Therefore, they both have a z-score of 1/2. 

Example 7.1 

Do an example using the normal distribution and the standardization. 

Problem 

Several studies have shown that the amount of time people stand in line waiting for a bank teller 
is normally distributed. Suppose the mean waiting time is 3 minutes and the standard deviation is 
1.5 minutes. Let X = the amount of time, in minutes, one person stands in line waiting for a teller. 
Notation: X~N (3, 1.5) 

Find the probability that one person waits in line for a teller less than 2 minutes. Have students 
draw the picture and write a probability statement. The picture should have the x-axis. 

Solution 




Figure 7.3: Probability statement: P(X < 2) = 0.2525. If you use the TI-83/84 series, use the function
normalcdf(-10^99, 2, 3, 1.5) in 2nd DISTR.

Figure 7.4: P(X < k) = 0.95. If you use the TI-83/84 series, use the function
InvNorm(.95, 3, 1.5). k = 5.47
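If a computer is available instead of a calculator, the same two quantities can be sketched with Python's standard library. This is an illustration under the example's stated model X ~ N(3, 1.5); the variable names are assumptions made here.

```python
from statistics import NormalDist

# X ~ N(3, 1.5): waiting time in minutes, as in Example 7.1
wait = NormalDist(mu=3, sigma=1.5)

p_less_than_2 = wait.cdf(2)   # P(X < 2), area to the left of 2
k = wait.inv_cdf(0.95)        # 95th percentile: P(X < k) = 0.95

print(round(p_less_than_2, 4), round(k, 2))
```

cdf plays the role of normalcdf with a lower bound of negative infinity, and inv_cdf plays the role of InvNorm.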



NOTE: The normal approximation to the binomial is NOT included in this text. With graphics
calculators and computer software, it is easy to draw a binomial graph with a small n and then
increase n to, say, 50. Students will see the graph approach the normal. The normal approximation
states that if X follows a binomial distribution with number of trials equal to n and probability
of success for any trial equal to p (X ~ B(n, p)), then by adding ±0.5 to X, you get a new random
variable Y (Y is either X + 0.5 or X − 0.5) and Y approximately follows a normal distribution
(Y ~ N(np, √(npq)), where q = 1 − p). For the approximation to be a good one, you want np > 5, nq > 5, and n > 20.
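For instructors who do want to show the approximation, a small comparison may help. The values n = 50 and p = 0.3 below are hypothetical, chosen only so that the rule of thumb is satisfied. The sketch compares an exact binomial probability with its continuity-corrected normal approximation.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.3          # hypothetical values with np > 5, nq > 5, n > 20
q = 1 - p

# Exact binomial probability P(X <= 15)
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(16))

# Normal approximation with the +0.5 continuity correction:
# P(Y < 15.5) where Y ~ N(np, sqrt(npq))
approx = NormalDist(n * p, sqrt(n * p * q)).cdf(15.5)

print(round(exact, 4), round(approx, 4))
```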

Assign Practice 

Assign the Practice 2 in class to be done in groups. 

Assign Homework 

Assign Homework 3 . Suggested problems: 1-11 odds, 8, 10, 12 - 19.



2 "Normal Distribution: Practice" <http://cnx.org/content/m16983/latest/>
3 "Normal Distribution: Homework" <http://cnx.org/content/m16978/latest/>
Chapter 8

Ch. 7: Central Limit Theorem



The Central Limit Theorem (CLT) is considered to be one of the most powerful theorems in all of statistics
and probability. It states that if you repeatedly draw samples of size n and average (or sum) each one, the
distribution of the averages (or sums) is approximately normal, provided n is sufficiently large.

Example 8.1 

Suppose μ and σ are the original mean and standard deviation of the population from which each
sample of size n is drawn. Let X̄ = the random variable for the average of a sample of size n. Let ΣX = the
random variable for the sum of a sample of size n.

X̄ ~ N(μ, σ/√n)

ΣX ~ N(nμ, √n·σ)

The Dice Experiment 

At the beginning of the chapter, there is a dice experiment. Together with the students, do the experiment. 
The example consists of rolling 10 times each, 1 die, 2 dice, 5 dice, and 10 dice and averaging the faces. 
Draw graphs (histograms are OK). This experiment, most of the time, shows that, as the number of dice 
increase, the graph looks more and more bell-shaped. Because the samples taken are usually small, you 
will not necessarily get a perfect bell-shaped curve. However, the students should get the idea. 
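If class time is short, the dice experiment can be simulated. The sketch below is an assumption-laden stand-in for the classroom activity: it uses 1000 repetitions per setting (rather than 10) and a fixed random seed so that the run is repeatable.

```python
import random
from statistics import mean, stdev

def average_of_dice(num_dice, rng):
    """Roll num_dice fair dice once and return the average face value."""
    return mean(rng.randint(1, 6) for _ in range(num_dice))

rng = random.Random(0)   # fixed seed, chosen arbitrarily for repeatability
results = {}
for num_dice in (1, 2, 5, 10):
    averages = [average_of_dice(num_dice, rng) for _ in range(1000)]
    # Store the center and spread of the simulated averages
    results[num_dice] = (mean(averages), stdev(averages))

for num_dice, (center, spread) in results.items():
    print(num_dice, round(center, 2), round(spread, 2))
```

The centers stay near 3.5 while the spread shrinks as more dice are averaged, which is what the histograms in class should suggest.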

Example 8.2 
Calculate Averages 

It can be shown that the average amount of money one person spends on one trip to a particular
supermarket is $51. The individual amounts spent follow an exponential distribution.

Problem

Find the probability that the average spent by a sample of 40 people is more than $60.

Solution 

Let X̄ = the average amount of money that 40 people spend. Have the students draw the appropriate
picture, labeling the x-axis with X̄. The mean μ = 51 and the standard deviation σ = 51 (for an exponential
distribution, the standard deviation equals the mean). If you are using the TI-83/84 series, use the function
normalcdf(60, 10^99, 51, 51/√40).

The 75th percentile for the average amount spent by 40 people at the supermarket is $56.44. This
means that 75% of the averages from samples of 40 people are no more than $56.44 and 25% are no less than that amount.

This can be calculated by using the TI-83/84 function InvNorm(.75, 51, 51/√40).
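The two calculator steps for this example can be sketched in code as well. This assumes the CLT model X̄ ~ N(μ, σ/√n) with μ = σ = 51 and n = 40; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 51, 51, 40

# Distribution of the sample average under the Central Limit Theorem
xbar = NormalDist(mu, sigma / sqrt(n))

p_more_than_60 = 1 - xbar.cdf(60)   # P(X-bar > 60)
pct_75 = xbar.inv_cdf(0.75)         # 75th percentile of the averages

print(round(p_more_than_60, 4), round(pct_75, 2))
```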



1 This content is available online at <http://cnx.org/content/m16957/1.8/>.
Calculate Sums

You can also do examples for sums. We, the authors, do not do sums because of time (we are on a quarter 
system). Help the students to find the probability that the total (sum) amount of money spent by 10 people 
at the supermarket is less than $500. Also, help them do a percentile problem. 

Z-score Formulas 

If you want to teach the z-score formulas for averages and sums, they are:

• Averages: z = (value − μ)/(σ/√n)

• Sums: z = (value − nμ)/(√n·σ)
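As a sketch of the sums problem mentioned above (total spent by 10 people less than $500), the z-score for a sum can be computed directly. This assumes ΣX ~ N(nμ, √n·σ) with μ = σ = 51. With n as small as 10 and an exponential population, the normal model is only a rough approximation.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 51, 51, 10
value = 500

# z-score for a sum: (value - n*mu) / (sqrt(n) * sigma)
z = (value - n * mu) / (sqrt(n) * sigma)

# P(sum < 500), computed two equivalent ways
p_from_z = NormalDist().cdf(z)
p_direct = NormalDist(n * mu, sqrt(n) * sigma).cdf(value)

print(round(z, 3), round(p_direct, 4))
```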

Assign Practice 

Assign the Practice 2 in class to be done in groups. 

Assign Homework 

Assign Homework 3 . Suggested homework: (averages) 1a - f, 3, 5, 9, 10, 11a - d, f, k, 13a - c, g - j, 16, 17, 19 - 23



2 "Central Limit Theorem: Practice" <http://cnx.org/content/m16954/latest/>
3 "Central Limit Theorem: Homework" <http://cnx.org/content/m16952/latest/>



Chapter 9 

Ch. 8: Confidence Intervals 



Confidence intervals can be difficult for students. This chapter discusses confidence intervals for a single
mean and for a single proportion. In this course, we do not deal with confidence intervals for two means
or two proportions. For a single mean, confidence intervals are calculated when σ is known and when σ is
not known (s is used as an estimate for σ).

Book notation: 

• CL = confidence level 

• EBM = error bound for a mean 

• EBP = error bound for a proportion 

The Student-t distribution is introduced in this chapter beginning with a little history:

NOTE: William Gosset derived the t-distribution in 1908. He needed a method for dealing with
small samples (less than 30) in his research on temperature at the Guinness Brewery. Legend has
it that the name Student-t comes from the fact that Gosset wrote a paper about the t-distribution
and signed the paper Student because he was too modest to use his own name.

If you sample from a normal distribution in which σ is not known, replace σ with s, the sample standard
deviation, and use the Student-t distribution. The shape of the curve depends on the parameter degrees of
freedom (df). df = n − 1 where n is the sample size.

NOTE: t_df designates the distribution. We use T as the random variable. Value is an average.

The t-statistic (t-score)

t = (value − μ)/(s/√n)

The relationship between the confidence interval for a single mean (when σ is known) and the confidence level can be
shown in a picture as follows:



1 This content is available online at <http://cnx.org/content/m16975/1.11/>.

Figure 9.1: Standard normal curve with central area CL = 1 − α between −z_{α/2} and z_{α/2} and area α/2 in each tail. The α/2 subscript indicates that the area to the right is α/2.



Formulas for the error bounds:

• Single mean (known σ): EBM = z_{α/2}·(σ/√n)

• Single mean (unknown σ): EBM = t_{α/2}·(s/√n)

• Binomial proportion: EBP = z_{α/2}·√(p'q'/n) where q' = 1 − p'

The confidence intervals have the form:

Single mean (unknown or known σ): (x̄ − EBM, x̄ + EBM)
Binomial proportion: (p' − EBP, p' + EBP)

Example 9.1 

The number of calories in fast food is always of interest. A survey was taken from 7 fast food 
restaurants concerning the number of calories in 4 ounces of french fries. The data is 296, 329, 306, 
324, 292, 310, 350. Construct a 95% confidence interval for the true average number of calories in 
a 4 ounce serving of french fries. 

Solution 

You want a confidence interval for a single mean where σ is not known. If you use the TI-83/84
series, enter the data into a list and then use the function TInterval, data option. C-level is 95.
The confidence interval is (296.4, 334.2). This function also calculates the sample mean (315.3) and
sample standard deviation (20.4). TInterval is found in STAT TESTS.

If you want the students to use the formulas for a normal or for the Student-t confidence interval, 
you will need to use a table for the z-score or the t-score. The book does not have the tables but 
the Internet has several. Do a search on "z-score table" and "Student-t table." 

First, you need to calculate the sample mean and the sample standard deviation. 

• x̄ = 315.29

• s = 20.40

The confidence interval has the pattern: (x̄ − EBM, x̄ + EBM)

The error bound formula is: EBM = t_{α/2}·(s/√n)

CL = 0.95 so α = 0.05. Therefore, α/2 = 0.025.

Using the Student-t table with df = 7 − 1 = 6, t_{0.025} = 2.45.

Figure 9.2: Student-t curve (df = 6) with central area 0.95 between −2.45 and 2.45 and area 0.025 in each tail.

EBM = t_{0.025}·(s/√n) = 2.45·(20.40/√7) = 18.89

The confidence interval is (x̄ − EBM, x̄ + EBM) = (315.29 − 18.89, 315.29 + 18.89) = (296.4, 334.2)

We are 95% confident that the true average number of calories in a 4 ounce serving of french fries
is between 296.4 and 334.2 calories.
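The formula route for this example can be sketched in code. Python's standard library has no Student-t quantile function, so the critical value below is typed in from a t table (2.447 for df = 6 and a right-tail area of 0.025). That table lookup is an assumption of this sketch, not something the code computes.

```python
from math import sqrt
from statistics import mean, stdev

calories = [296, 329, 306, 324, 292, 310, 350]

n = len(calories)
x_bar = mean(calories)     # sample mean
s = stdev(calories)        # sample standard deviation (divides by n - 1)

t_crit = 2.447             # from a Student-t table: df = 6, area 0.025 to the right
ebm = t_crit * s / sqrt(n) # error bound for the mean

interval = (x_bar - ebm, x_bar + ebm)
print(round(x_bar, 2), round(s, 2), [round(v, 1) for v in interval])
```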



Example 9.2 

At a local cabana club, 102 of the 450 families who are members have children who swam on the 
swim team in 1995. Construct an 80% confidence interval for the true proportion of families with 
children who swim on the swim team in any year. 

Solution 

You want a confidence interval for a single proportion. If you use the TI-83/84 series, use the
function 1-PropZInt. x = 102, n = 450, C-Level = 80. The confidence interval is (.2014, .2520).

If you want to use the formulas, first, you need to calculate the estimated proportion.

p' = x/n = 102/450 = 0.23

The confidence interval has the pattern (p' − EBP, p' + EBP).
The error bound formula is EBP = z_{α/2}·√(p'q'/n) where q' = 1 − p'.
CL = 0.80 so α = 0.20. Therefore, α/2 = 0.10.

Using the normal table (find one on the Internet), z_{0.10} = 1.28. (Remind students that 0.10 is the
area to the right. The area to the left is 0.90.)

EBP = z_{α/2}·√(p'q'/n) = z_{0.10}·√((0.23)(0.77)/450) = 1.28·√((0.23)(0.77)/450) = 0.03

The confidence interval is: (p' − EBP, p' + EBP) = (0.23 − 0.03, 0.23 + 0.03) = (0.20, 0.26)

We are 80% confident that the true proportion of families that have children on the swim team in
any year is between 0.20 and 0.26. (The upper bound differs slightly from the calculator's because p' and EBP
were rounded to two decimal places.)
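A sketch of the same interval without intermediate rounding follows. It assumes the usual normal-based formula for a single proportion; the variable names are illustrative. Because it carries full precision, its digits can differ slightly from a hand calculation that rounds p' and EBP.

```python
from math import sqrt
from statistics import NormalDist

x, n, cl = 102, 450, 0.80

p_hat = x / n                           # estimated proportion p'
alpha = 1 - cl
z = NormalDist().inv_cdf(1 - alpha / 2) # z with area alpha/2 to the right
ebp = z * sqrt(p_hat * (1 - p_hat) / n) # error bound for the proportion

interval = (p_hat - ebp, p_hat + ebp)
print(round(z, 2), [round(v, 4) for v in interval])
```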



Assign Practice 

Assign the Practice 1 2 , Practice 2 3 , and Practice 3 4 in class to be done in groups.

Assign Homework 

Assign Homework 5 . Suggested homework: 1, 5, 9, 13, 15, 17, 21, 23, 24 - 31. 



2 "Confidence Intervals: Practice 1" <http://cnx.org/content/m16970/latest/>
3 "Confidence Intervals: Practice 2" <http://cnx.org/content/m16971/latest/>
4 "Confidence Intervals: Practice 3" <http://cnx.org/content/m16968/latest/>
5 "Confidence Intervals: Homework" <http://cnx.org/content/m16966/latest/>



Chapter 10 



Ch. 9: Hypothesis Testing of Single 
Mean and Single Proportion 



Hypothesis testing is done constantly in business, education, and medicine to name just a few areas. To 
perform a hypothesis test, you set up two contradictory hypotheses and use data to support one of them. 
Introduce the students to hypothesis testing by an example. Use a table to show the outcomes. Use H0 as
the null hypothesis and Ha as the alternate hypothesis. Go over the language "reject H0" and "do not reject
H0".

Example 10.1

H0: John loves Marcia. Ha: John does not love Marcia.

• Type I error: Reject the null when the null is true. P(Type I error) = α.

• Type II error: Do not reject the null when the null is false. P(Type II error) = β.

• Type I error: Marcia thinks John does not love her when he really does. 

• Type II error: Marcia thinks John does love her when he does not. 

Have the students try to write out the errors before you do. They may require a little prompting. 
Then have them state the possible consequences for the errors. 

Conducting a Hypothesis Test 

To perform the hypothesis test, sample data is gathered. The data typically favors one of the hypotheses (but 
not always). The test determines which hypothesis the data favors. If the data favors the null hypothesis, 
we "do not reject" the null hypothesis. If the data does not favor the null hypothesis, we "reject" the null 
hypothesis. To not reject or to reject are decisions. After a decision is reached, an appropriate conclusion is 
made using complete sentences. 

Sometimes the data favors neither hypothesis. In this case, we say the test is inconclusive. 

A hypothesis test may be left-tailed, right-tailed or two-tailed. What the test is concerned with generally 
determines what type of test is being done. 

Associated with the null hypothesis is a pre-conceived α. α = P(Type I error). Students sometimes have a
difficult time when there is no pre-conceived α. We use α = 0.05 if there is none.



1 This content is available online at <http://cnx.org/content/m17008/1.8/>.

The data is used to calculate the p-value. The p-value is the probability that results at least as extreme as the
sample data will happen purely by chance when the null hypothesis is true. If we reject the null hypothesis, then we believe
the information did not happen purely by chance with the current null hypothesis. Therefore, we believe
that the null hypothesis is not true.

The decision (to reject or not reject) is based on whether α > p-value or α < p-value: if α > p-value, reject the
null hypothesis; if α < p-value, do not reject it.
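The decision rule is simple enough to state as a one-line function. The function name decide and the sample values are illustrative only.

```python
def decide(alpha, p_value):
    """Apply the decision rule: reject the null only when alpha > p-value."""
    return "reject H0" if alpha > p_value else "do not reject H0"

print(decide(0.05, 0.01))   # p-value smaller than alpha
print(decide(0.05, 0.48))   # p-value larger than alpha
```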

The example in the book concerning Jeffrey, an eight-year old swimmer, is a good first example to do with 
the class. They can follow along in the book and then complete the problem that follows (bench press 
problem). By filling in the blanks, they are led through the steps of hypothesis testing. 

In the beginning, the students have the most difficulty in determining which test to use (test of a single 
mean - normal or Student-t or test a binomial proportion) and the type (left-, right-, or two-tailed). We do 
several examples (usually we choose some homework problems) in class with the students. If a single mean 
Student-t is done, the assumption is that the population from which the data is taken is normal. In reality, 
this would have to be shown to be true. 

Here 2 is a series of solution sheets that can be copied and used by the students to do the hypothesis testing 
problems. A solution sheet makes it clearer to the student what the steps to the tests are. 

Go over the solution for "Fido's Fleas", a binomial proportion hypothesis testing problem written as a poem. 
The problem is at the end of the text portion of the chapter. The solution on a solution sheet follows the 
poem. 

If you use the TI-83/84 series, there are functions to perform the different hypothesis tests. They can be
found in STAT TESTS. Z-Test (normal test) does a test of a single mean when the population standard deviation
is known; T-Test (Student-t test) does a test of a single mean when the population standard deviation
is not known; 1-PropZTest (normal test) does a test of a single proportion. The examples in the book contain
TI-83/84 calculator instructions, in detail.

Assign Practice 

Assign Practice 1 3 , Practice 2 4 , and Practice 3 5 to be done collaboratively.

Assign Homework 

Assign Homework 6 . Suggested problems: 1 - 15 odds, 19, 21, 25, 29, 31, 33, 34 - 44. 

Assign Projects 

There are two partner projects for this lesson: one uses an article 7 and the other is a word problem 8 .
Students create their own hypothesis testing problems and learn much from the process.



2 "Collaborative Statistics: Solution Sheets: The Chi-Square Distribution" <http://cnx.org/content/m17136/latest/>
3 "Hypothesis Testing of Single Mean and Single Proportion: Practice 1" <http://cnx.org/content/m17004/latest/>
4 "Hypothesis Testing of Single Mean and Single Proportion: Practice 2" <http://cnx.org/content/m17016/latest/>
5 "Hypothesis Testing of Single Mean and Single Proportion: Practice 3" <http://cnx.org/content/m17003/latest/>
6 "Hypothesis Testing of Single Mean and Single Proportion: Homework" <http://cnx.org/content/m17001/latest/>
7 "Collaborative Statistics: Projects: Hypothesis Testing Article" <http://cnx.org/content/m17140/latest/>
8 "Collaborative Statistics: Projects: Hypothesis Testing Word Problem" <http://cnx.org/content/m17144/latest/>



Chapter 11 

Ch 10: Hypothesis Testing of Two Means 
and Two Proportions 



The comparison of two groups is done constantly in business, medicine, and education, to name just a few
areas. You can start this chapter by asking students whether they have read on the Internet or seen on
television any studies that involve two groups. Examples include diet versus hypnotism, Bufferin® with
aspirin versus Tylenol®, Pepsi Cola® versus Coca Cola®, and Kellogg's Raisin Bran® versus Post Raisin
Bran®. There are hundreds of examples on the Internet, in newspapers, and in magazines.

This chapter covers independent groups for two population means and two population proportions and 
matched or paired samples. The module relies heavily on technology. Instructions for the TI-83/84 series 
of calculators are included for each example. If you and your class are interested, the formulas for the test 
statistics are included in the text. 

Doing problems 1 - 10 in the Homework 2 helps the students to determine what kind of hypothesis test they 
should perform. 

Example 11.1: Matched or Paired Samples 

A course is designed to increase mathematical comprehension. In order to evaluate the effectiveness
of the course, students are given a test before and after the course. The sample data is:

Before Course     90    100    160    112     95    190    125
After Course     120     95    150    150    100    200    120



Table 11.1 
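The matched-pairs idea (reduce the two measurements to one list of differences, then run a single-mean t test) can be sketched as follows. The direction of subtraction (after minus before) and the null hypothesis of a mean difference of 0 are assumptions of this sketch. The p-value itself would still come from T-Test on the calculator or a t table.

```python
from math import sqrt
from statistics import mean, stdev

before = [90, 100, 160, 112, 95, 190, 125]
after = [120, 95, 150, 150, 100, 200, 120]

# One difference per student turns two samples into a single sample
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
d_bar = mean(differences)              # mean difference
s_d = stdev(differences)               # standard deviation of the differences
t_stat = (d_bar - 0) / (s_d / sqrt(n)) # t-score under H0: mean difference = 0
df = n - 1

print(differences, d_bar, round(s_d, 2), round(t_stat, 3), df)
```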

Example 11.2: Two Proportions, Independent Groups 

Suppose in the last local election, among 240 30-45 year olds, 45% voted and among 260 46-60 year 
olds, 50% voted. Does the data indicate that the proportion of 30-45 year olds who voted is less 
than the proportion of 46-60 year olds? Test at a 1% level of significance. 
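A sketch of this left-tailed test of two proportions follows. It assumes the pooled-proportion z statistic, which is also what 2-PropZTest uses, and it recovers the counts from the stated percentages.

```python
from math import sqrt
from statistics import NormalDist

n1, p1 = 240, 0.45   # 30-45 year olds: sample size and proportion who voted
n2, p2 = 260, 0.50   # 46-60 year olds

x1, x2 = round(n1 * p1), round(n2 * p2)   # numbers who voted in each group
p_pooled = (x1 + x2) / (n1 + n2)

se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = NormalDist().cdf(z)             # left-tailed: Ha is p1 < p2

alpha = 0.01
decision = "reject H0" if alpha > p_value else "do not reject H0"
print(round(z, 3), round(p_value, 4), decision)
```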

Firm A: 

• N_A = 20

• S_A = $100

• X̄_A = $1500



1 This content is available online at <http://cnx.org/content/m17020/1.8/>.

2 "Hypothesis Testing of Two Means and Two Proportions: Homework" <http://cnx.org/content/m17023/latest/>



Firm B: 



• N_B = 22

• S_B = $200

• X̄_B = $1900

Test the claim that the average price of Firm A's laptop is no different from the average price of 
Firm B's laptop. 

Calculator Instructions 

If you use the TI-83/84 series, the functions are located in STAT TESTS. The function for two proportions
is 2-PropZTest, the function for two means is 2-SampTTest if the population standard deviations are not
known and 2-SampZTest if the population standard deviations are known (highly unlikely). The function
for matched pairs is T-Test (the same test used for test of a single mean) because you combine two measurements
for each object into a single set of "difference" data. For the function 2-SampTTest, answer "NO" to
"Pooled."

Assign Practice 

Have students do the Practice 1 3 and Practice 2 4 collaboratively in class. These practices are for two proportions
and two means. For matched pairs, you could have them do Example 10-7 in the text.

Assign Homework 

Assign Homework 5 . Suggested homework problems: 1 - 10, 11, 13, 15, 17, 19, 23, 25, 31, 39 - 52.



3 "Hypothesis Testing: Two Population Means and Two Population Proportions: Practice 1" <http://cnx.org/content/m17027/latest/>

4 "Hypothesis Testing: Two Population Means and Two Population Proportions: Practice 2" <http://cnx.org/content/m17039/latest/>

5 "Hypothesis Testing of Two Means and Two Proportions: Homework" <http://cnx.org/content/m17023/latest/>



Chapter 12 

Ch 11: The Chi-Square Distribution 



This chapter is concerned with three chi-square applications: goodness-of-fit; independence; and single 
variance. We rely on technology to do the calculations, especially for goodness-of-fit and for independence. 
However, the first example in the chapter (the number of absences in the days of the week) has the student 
calculate the chi-square statistic in steps. The same could be done for the chi-square statistic in a test of 
independence. 

The chi-square distribution generally is skewed to the right. There is a different chi-square curve for each
df. When the df's are 90 or more, the chi-square curve is very well approximated by the normal.
For the chi-square distribution, μ = the number of df's and σ = the square root of twice the number of df's.

Goodness-of-Fit Test 

A goodness-of-fit hypothesis test is used to determine whether or not data "fit" a particular distribution. 

Example 12.1 

In a past issue of the magazine GEICO Direct, there was an article concerning the percentage 
of teenage motor vehicle deaths and time of day. The following percentages were given from a 
sample. 

Time of Day Percentage of Motor Vehicle Deaths

Time of Day           Death Rate
12 a.m. to 3 a.m.     17%
3 a.m. to 6 a.m.      8%
6 a.m. to 9 a.m.      8%
9 a.m. to 12 noon     6%
12 noon to 3 p.m.     10%
3 p.m. to 6 p.m.      16%
6 p.m. to 9 p.m.      15%
9 p.m. to 12 a.m.     19%

Table 12.1 



1 This content is available online at <http://cnx.org/content/m17060/1.11/>.
For the purpose of this example, suppose another sample of 100 produced the same percentages.
We hypothesize that the data from this new sample fits a uniform distribution. The level of significance
is 1% (α = 0.01).

• H0: The number of teenage motor vehicle deaths fits a uniform distribution.

• Ha: The number of teenage motor vehicle deaths does not fit a uniform distribution.

The distribution for the hypothesis test is χ²_7 (df = 8 cells − 1 = 7).

The table contains the observed percentages. For the sample of 100, the observed (O) numbers are
17, 8, 8, 6, 10, 16, 15 and 19. The expected (E) numbers are each 12.5 for a uniform distribution (100
divided by 8 cells). The chi-square test statistic is calculated using

χ² = Σ (O − E)²/E

= (17 − 12.5)²/12.5 + (8 − 12.5)²/12.5 + (8 − 12.5)²/12.5 + (6 − 12.5)²/12.5
+ (10 − 12.5)²/12.5 + (16 − 12.5)²/12.5 + (15 − 12.5)²/12.5 + (19 − 12.5)²/12.5

= 13.6

If you are using the TI-84 series graphing calculators, ON SOME OF THEM there is a function in
STAT TESTS called χ²GOF-Test that does the goodness-of-fit test. You first have to enter the observed
numbers in one list (enter as whole numbers) and the expected numbers (uniform implies
they are each 12.5) in a second list (enter 12.5 for each entry: 100 divided by 8 = 12.5). Then do the
test by going to χ²GOF-Test.

If you are using the TI-83 series, enter the observed numbers in list1 and the expected numbers in
list2 and in list3 (go to the list name), enter (list1-list2)^2/list2. Press enter. Add the values in list3
(this is the test statistic). Then go to 2nd DISTR χ²cdf. Enter the test statistic (13.6) and the upper
value of the area (10^99) and the degrees of freedom (7).

Probability Statement: P(χ² > 13.6) = 0.0588

(Always a right-tailed test) 




Figure 12.1: p-value = 0.0588 



Since α < p-value (0.01 < 0.0588), we do not reject H0.



We conclude that there is not sufficient evidence to reject the null hypothesis. At the 1% level, the data are
consistent with the number of teenage motor vehicle deaths fitting a uniform distribution; that is, with teenagers
dying from motor vehicle accidents at about the same rate at any time of the day
or night. However, if the level of significance were 10%, we would reject the null hypothesis and
conclude that the distribution of deaths does not fit a uniform distribution.
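The goodness-of-fit arithmetic can be sketched in code. The test statistic follows the formula in the example directly. Python's standard library has no chi-square distribution, so the right-tail area below uses a finite-series identity that holds only for odd degrees of freedom. That helper (chi2_sf_odd_df) is an assumption of this sketch and not part of the book or the calculator.

```python
from math import erfc, exp, pi, sqrt

observed = [17, 8, 8, 6, 10, 16, 15, 19]
expected = [12.5] * len(observed)        # uniform: 100 divided by 8 cells

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

def chi2_sf_odd_df(x, df):
    """P(chi-square > x) for ODD df, via a finite series (not valid for even df)."""
    root = sqrt(x)
    tail = erfc(root / sqrt(2))           # equals 2 * P(Z > sqrt(x))
    density = exp(-x / 2) / sqrt(2 * pi)  # standard normal density at sqrt(x)
    series, term = 0.0, root              # first term is x**0.5 / 1
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)           # next term: x**(r + 0.5) / (1*3*...*(2r+1))
    return tail + 2 * density * series

p_value = chi2_sf_odd_df(chi2_stat, df)
print(round(chi2_stat, 1), df, round(p_value, 4))
```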

A test of independence compares two factors to determine if they are independent (i.e. one factor does not 
affect the happening of a second factor). 

Example 12.2 

The following table shows a random sample of 100 hikers and the area of hiking preferred.

Hiking Preference Area

Gender    The Coastline    Near Lakes and Streams    On Mountain Peaks
Female    18               16                        11
Male      16               25                        14

Table 12.2: The two factors are gender and preferred hiking area. 



• H0: Gender and preferred hiking area are independent.

• Ha: Gender and preferred hiking area are not independent.

The distribution for the hypothesis test is χ²_2.

The df's are equal to: (rows − 1)(columns − 1) = (2 − 1)(3 − 1) = 2

The chi-square statistic is calculated using Σ_(2·3) (O − E)²/E

Each expected (E) value is calculated using (row total)(column total)/(total surveyed)

The first expected value (female, the coastline) is (45)(34)/100 = 15.3
The expected values are: 15.3, 18.45, 11.25, 18.7, 22.55, 13.75
The chi-square statistic is:

Σ_(2·3) (O − E)²/E = (18 − 15.3)²/15.3 + (16 − 18.45)²/18.45 + (11 − 11.25)²/11.25
+ (16 − 18.7)²/18.7 + (25 − 22.55)²/22.55 + (14 − 13.75)²/13.75

= 1.47

Calculator Instructions 

The TI-83/84 series have the function χ²-Test in STAT TESTS to perform this test. First, you have
to enter the observed values in the table into a matrix by using 2nd MATRIX and EDIT [A]. Enter
the values and go to χ²-Test. Matrix [B] is calculated automatically when you run the test.

Probability Statement: p-value = 0.4800 (A right-tailed test)

Figure 12.2: p-value = 0.4800



Since α < p-value (0.05 < 0.4800), we do not reject the null.

There is not sufficient evidence to conclude that gender and hiking preference are not independent. 
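The expected values and the test statistic for the hiking table can be sketched as follows. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the right-tail area of the chi-square distribution is exp(−x/2). That shortcut does not generalize to other df and is flagged as such in the code.

```python
from math import exp

# Observed counts: rows are Female, Male; columns are the three hiking areas
observed = [[18, 16, 11],
            [16, 25, 14]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count = (row total)(column total) / (total surveyed)
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi2_stat = sum((o - e) ** 2 / e
                for obs_row, exp_row in zip(observed, expected)
                for o, e in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = exp(-chi2_stat / 2)   # right-tail area; valid ONLY because df == 2

print(expected, round(chi2_stat, 2), df, round(p_value, 4))
```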

Sometimes you might be interested in how something varies. A test of a single variance is the type of 
hypothesis test you could run in order to determine variability. 

Example 12.3 

A vending machine company which produces coffee vending machines claims that its machine 
pours an 8 ounce cup of coffee, on the average, with a standard deviation of 0.3 ounces. A college 
that uses the vending machines claims that the standard deviation is more than 0.3 ounces causing 
the coffee to spill out of a cup. The college sampled 30 cups of coffee and found that the standard 
deviation was 1 ounce. At the 1% level of significance, test the claim made by the vending machine 
company. 

Solution 

H0: σ² = (0.3)²    Ha: σ² > (0.3)²

The distribution for the hypothesis test is χ²_29 where df = 30 − 1 = 29.
The test statistic χ² = (n − 1)·s²/σ² = (30 − 1)·1²/0.3² = 322.22

Probability Statement: P(χ² > 322.22) ≈ 0

Figure 12.3: p-value ≈ 0

Since α > p-value (0.01 > 0), reject H0.

There is sufficient evidence to conclude that the standard deviation is more than 0.3 ounces of 
coffee. The vending machine company needs to adjust their machines to prevent spillage. 
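A short sketch shows why the p-value in this example is essentially 0. It computes the test statistic and then compares it with the mean (df) and standard deviation (√(2·df)) of the chi-square distribution. The final "how many standard deviations" figure is only an informal gauge, not a formal p-value.

```python
from math import sqrt

n = 30          # cups sampled
s = 1.0         # sample standard deviation, in ounces
sigma_0 = 0.3   # standard deviation claimed by the company

df = n - 1
chi2_stat = (n - 1) * s**2 / sigma_0**2

# For a chi-square distribution: mean = df, standard deviation = sqrt(2 * df)
sds_above_mean = (chi2_stat - df) / sqrt(2 * df)

print(round(chi2_stat, 2), df, round(sds_above_mean, 1))
```

A statistic dozens of standard deviations above the mean leaves practically no area in the right tail.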
Assign Practice

Have the students do the Practice 1 2 , Practice 2 3 , and Practice 3 4 in class collaboratively.

Assign Homework 

Assign Homework 5 . Suggested homework: 3, 5, 7 (GOF), 9, 13, 15 (Test of Indep.), 17, 19, 23 (Variance), 24 
- 37 (General) 



2 "The Chi-Square Distribution: Practice 1" <http://cnx.org/content/m17054/latest/>
3 "The Chi-Square Distribution: Practice 2" <http://cnx.org/content/m17056/latest/>
4 "The Chi-Square Distribution: Practice 3" <http://cnx.org/content/m17053/latest/>
5 "The Chi-Square Distribution: Homework" <http://cnx.org/content/m17028/latest/>
Chapter 13

Ch 12: Linear Regression and 
Correlation 1 



Entire courses are given on linear regression and correlation. This chapter serves as an introduction to the 
topics. 

It helps to review the equation of a line. We use a for the y-intercept and b for the slope. The line has the 
form: y = a + bx 

Example 13.1 

Have the students plot a line by eye using the following data. The independent variable x represents
the size of a color television screen in inches at Anderson's and y represents the sales price
in dollars.

x      9     20     27     31     35     40     60
y    147    197    297    447   1177   2177   2497

Table 13.1 

Ask them what they got for the slope and for the y-intercept. Make comparisons. This exercise 
should point out how difficult it is to get an accurate line of best fit and how many lines "seem" to 
fit the data. (This data is taken from the exercises.) 

Solution 

For the data above, use either a calculator or a computer and calculate the least squares or best fit 
line. Look at the scatter plot first. Ask the students if their "by eye" line looks like the calculated 
one. Explain the correlation coefficient and then check if the correlation coefficient is significant by 
comparing it to the correct entry in 95% CRITICAL VALUES OF THE SAMPLE CORRELATION 
COEFFICIENT Table at the end of the reading. 

If you use the TI-83/84 series, enter the data into two lists first. Then plot the data points on the
calculator. First set up the stat plot (2nd STAT PLOT). Then press ZOOM 9 to see the plot. To do
the linear regression, go to the LinReg(a+bx) function in STAT CALC. Enter the lists. At this
time, you could also enter a y-variable after the lists (after you enter the lists, enter a comma and
then press VARS Y-VARS Function Y1). Press ENTER to see the linear regression. When you press
GRAPH, the line will plot.



1 This content is available online at <http://cnx.org/content/m17084/1.10/>.
Line of best fit: yhat = -745.2520 + 54.7557x.



Explain "predicting" (or forecasting) and have them predict the sales price of a 45 inch screen color TV. Have 
them predict the cost for a mini 5 inch color TV. (The answer is negative.) Discuss that the line is only valid 
from the lowest to the highest x - values. 
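For instructors using a computer rather than a calculator, the least-squares line and the two predictions can be sketched with the usual formulas b = Sxy/Sxx and a = ȳ − b·x̄. The helper name predict is illustrative.

```python
from statistics import mean

xs = [9, 20, 27, 31, 35, 40, 60]                 # screen size in inches
ys = [147, 197, 297, 447, 1177, 2177, 2497]      # sales price in dollars

x_bar, y_bar = mean(xs), mean(ys)

# Least-squares slope and intercept
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b = s_xy / s_xx
a = y_bar - b * x_bar

def predict(x):
    """Predicted price from the fitted line yhat = a + b*x."""
    return a + b * x

print(round(a, 2), round(b, 4))
print(round(predict(45), 2))   # inside the observed range of x
print(round(predict(5), 2))    # outside the range: the prediction is negative
```

The negative price for a 5 inch screen is a useful prompt for the discussion of why the line should not be used outside the observed x-values.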

Example 13.2 

Have the students follow the "outlier" example in the text and (just once!) do the calculations for 
finding an outlier. Have them fill in the table below. 



x    y    y − yhat    |y − yhat|    (|y − yhat|)²

Table 13.2

Find: SSE = Σ (|y − yhat|)²

Find s = √(SSE/(n − 2))



n =the total number of data values (7 for this problem) 

s is the standard deviation of the | y — yhat | values 

Multiply s by 1.9: (1.9) (s) = 

Compare each | y — yhat | to (1.9) (s). 

If any | y — yhat | is at least (1.9) (s), then the corresponding point is an outlier. (None of the 
points is an outlier.) 
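The outlier check can be sketched in code so that the table does not have to be filled in by hand more than once. The sketch refits the line itself so that it is self-contained, and it uses the 1.9·s cutoff described above.

```python
from math import sqrt
from statistics import mean

xs = [9, 20, 27, 31, 35, 40, 60]
ys = [147, 197, 297, 447, 1177, 2177, 2497]

# Refit the least-squares line
x_bar, y_bar = mean(xs), mean(ys)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# |y - yhat| for each point, then SSE and s
abs_residuals = [abs(y - (a + b * x)) for x, y in zip(xs, ys)]
sse = sum(r ** 2 for r in abs_residuals)
n = len(xs)
s = sqrt(sse / (n - 2))

cutoff = 1.9 * s
outliers = [(x, y) for x, y, r in zip(xs, ys, abs_residuals) if r >= cutoff]

print(round(s, 1), round(cutoff, 1), outliers)
```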

Assign Practice 

Have the students do the Practice 2 collaboratively in class. 

Assign Homework 

Assign Homework. 3 Suggested homework: 1, 3, 5, 9, 13, 15 (a - f only if you use the calculator), 21 - 25.



2 "Linear Regression and Correlation: Practice" <http://cnx.org/content/m17088/latest/>
3 "Linear Regression and Correlation: Homework" <http://cnx.org/content/m17085/latest/>



Chapter 14 

Ch 13: F Distribution and ANOVA 



HISTORY: The F distribution is named after Ronald Fisher. Fisher is one of the most respected
statisticians of all time. He did a lot of statistical work in biology and genetics and became chair
of genetics at Cambridge University in England in 1943. In 1952, he was awarded knighthood.

This section is a very brief overview of the F distribution and two of its applications - One Way Analysis of 
Variance (ANOVA) and test of two variances. There are college courses which deal exclusively with these 
topics. ANOVA, particularly, is used regularly in industry. 

Explanation of Sum of Squares, Mean Square, and the F ratio for ANOVA 



• $k$ = the number of different groups

• $n_j$ = the size of the jth group

• $s_j$ = the sum of the values in the jth group

• $N$ = the total number of all the values combined (total sample size: $N = \sum n_j$)

• $x$ = one value: $\sum x = \sum s_j$

• Sum of squares of all values from every group combined: $\sum x^2$

• Total sum of squares: $SS_{total} = \sum x^2 - \frac{(\sum x)^2}{N}$

• Explained variation (between-group variability), the sum of squares representing variation among the different samples: $SS_{between} = \sum \left[ \frac{(s_j)^2}{n_j} \right] - \frac{(\sum s_j)^2}{N}$

• Unexplained variation, the sum of squares representing variation within samples due to chance: $SS_{within} = SS_{total} - SS_{between}$

• df's for different groups (df's for the numerator): $df_{between} = k - 1$

• Equation for errors within samples (df's for the denominator): $df_{within} = N - k$

• Mean square (variance estimate) explained by the different groups: $MS_{between} = \frac{SS_{between}}{df_{between}}$

• Mean square (variance estimate) that is due to chance (unexplained): $MS_{within} = \frac{SS_{within}}{df_{within}}$

• F ratio or F statistic of two estimates of variance: $F = \frac{MS_{between}}{MS_{within}}$
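The sum-of-squares bookkeeping above can be followed step by step in a short Python sketch. The function name `one_way_anova` and the three small groups are hypothetical, chosen only so that the group sizes differ. The sketch assumes the total, between, and within sums of squares are related as described in the list above.

```python
def one_way_anova(groups):
    """Return the ANOVA quantities built from group sums and sums of squares."""
    k = len(groups)                                   # number of groups
    N = sum(len(g) for g in groups)                   # total sample size
    sum_x = sum(sum(g) for g in groups)               # sum of all values
    sum_x2 = sum(x * x for g in groups for x in g)    # sum of squared values
    correction = sum_x ** 2 / N                       # (sum of x)^2 / N

    ss_total = sum_x2 - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between

    df_between = k - 1
    df_within = N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": ms_between / ms_within,
    }

# Hypothetical data: three groups of sizes 3, 4, and 2
result = one_way_anova([[1, 2, 3], [2, 4, 6, 8], [5, 7]])
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Because the function works from group sums rather than group means, it mirrors a hand calculation and handles groups of different sizes without any change.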



NOTE: The above calculations were done with groups of different sizes. If the groups are the same 
size, the calculations simplify somewhat and the F ratio can be written as: 



1 This content is available online at <http://cnx.org/content/ml7073/1.9/>. 
F Ratio Formula

$$F = \frac{n \cdot (s_{\bar{x}})^2}{(s_{pooled})^2} \qquad (14.1)$$

where

• $(s_{\bar{x}})^2$ = the variance of the sample means

• $n$ = the sample size of each group

• $(s_{pooled})^2$ = the mean of the sample variances (pooled variance)

• $df_{numerator} = k - 1$

• $df_{denominator} = k(n - 1) = N - k$
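For instructors who want to show why the equal-size formula agrees with the mean-square definition of F, the steps can be sketched as follows. The extra notation is an assumption added for illustration: $\bar{x}_j$ is the mean of group j, $\bar{\bar{x}}$ is the grand mean (which equals the mean of the group means when every group has size n), and $v_j$ is the sample variance of group j.

```latex
\begin{align*}
MS_{between} &= \frac{\sum_{j=1}^{k} n\,(\bar{x}_j - \bar{\bar{x}})^2}{k-1}
              = n \cdot \frac{\sum_{j=1}^{k} (\bar{x}_j - \bar{\bar{x}})^2}{k-1}
              = n\,(s_{\bar{x}})^2 \\[4pt]
MS_{within}  &= \frac{\sum_{j=1}^{k} (n-1)\,v_j}{k(n-1)}
              = \frac{1}{k}\sum_{j=1}^{k} v_j
              = (s_{pooled})^2 \\[4pt]
F &= \frac{MS_{between}}{MS_{within}}
   = \frac{n\,(s_{\bar{x}})^2}{(s_{pooled})^2}
\end{align*}
```

The denominator degrees of freedom follow the same pattern: each of the k groups contributes n - 1, giving k(n - 1) = N - k.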

These calculations are easily done with a graphing calculator or a computer program. We present the 
information in the chapter assuming some kind of technology will be used. 

For ANOVA, the samples must come from normally distributed populations with the same variance, and 
the samples must be independent. The ANOVA test is right-tailed. 

In a test of two variances, the samples must come from normal populations and must be independent of 
each other. 

Exercise 14.1 (Solution on p. 46.) 

(One-Way ANOVA) 

Three different diet plans are to be tested for average weight loss. For each diet plan, 4 dieters are 
selected and their weight loss (in pounds) in one month's time is recorded. 



Plan 1    Plan 2    Plan 3
5         3.5       8
4.5       7         4
4         6         3.5
3         4         4.5

Table 14.1

Is the average weight loss the same for each plan? Conduct an ANOVA test with a 1% level of significance.

Exercise 14.2 (Solution on p. 47.) 

(Test of Two Variances): 

Machine A makes a box and machine B makes a lid. For the lid to fit the box correctly, the 
variances should be nearly the same. There is a suspicion that the variance of the box is greater 
than the variance of the lid. The following data was collected. 





                   Machine A (Box)    Machine B (Lid)
Number of Parts    9                  11
Variance           150                45

Table 14.2



Are the machines working properly? Test at a 5% level of significance. 

Assign Practice 

Have the students work collaboratively to complete the Practice [2].

Assign Homework

Assign Homework [3]. Suggested homework: 1, 3, 4, 5.



2 "F Distribution and ANOVA: Practice" <http://cnx.org/content/ml7067/latest/> 
3 "F Distribution and ANOVA: Homework" <http://cnx.org/content/ml7063/latest/> 
Solutions to Exercises in Chapter 14

Solution to Exercise 14.1 (p. 44) 

Let $\mu_1$, $\mu_2$, and $\mu_3$ be the population means for the three diet plans.

• $H_0: \mu_1 = \mu_2 = \mu_3$

• $H_a$: Not all pairs of means are equal.

• $df_{numerator} = 3 - 1 = 2$

• $df_{denominator} = 12 - 3 = 9$

The distribution for the test is $F_{2,9}$.

Using a calculator or computer, the test statistic is F = 0.47. The notation used for the F statistic may also
be $F_{2,9}$ (like the distribution). The TI-83/84 series has the function ANOVA in STAT TESTS. Enter the
lists of data separated by commas.

If you use the formulas for groups of the same size, the calculations are as follows: 

Sample means are 4.13, 5.13, and 5, respectively. Sample standard deviations are 0.8539, 1.6520, and 2.0412,
respectively.



$(s_{\bar{x}})^2 = 0.2956$       The variance of the sample means
$(s_{pooled})^2 = 2.5416$      The mean of the sample variances
$n = 4$                        The sample size of each group

Table 14.3



$$F = \frac{n \cdot (s_{\bar{x}})^2}{(s_{pooled})^2} = \frac{4 \cdot 0.2956}{2.5416} = 0.47 \qquad (14.2)$$

Probability Statement: $P(F > 0.47) = 0.6395$

Figure 14.1: p-value = 0.6395



Since $\alpha$ < p-value (0.01 < 0.6395), do not reject $H_0$.

At the 1% level of significance, there is not sufficient evidence to conclude that the average weight losses of the three diet plans differ. It appears that the three diet plans work equally well.
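The equal-size calculation in this solution can be checked with Python's standard `statistics` module. This is a sketch, not the calculator procedure the guide recommends: the variable names are hypothetical, the data are the Table 14.1 values read column by column, and the p-value line uses a closed form for the right tail of the F distribution that holds only when the numerator has 2 degrees of freedom.

```python
from statistics import mean, stdev, variance

# Weight loss (pounds) from Table 14.1, one list per diet plan
plans = [
    [5, 4.5, 4, 3],      # Plan 1
    [3.5, 7, 6, 4],      # Plan 2
    [8, 4, 3.5, 4.5],    # Plan 3
]
k = len(plans)           # number of groups
n = len(plans[0])        # common group size

group_means = [mean(p) for p in plans]
group_sds = [stdev(p) for p in plans]

var_of_means = variance(group_means)                   # (s_xbar)^2
pooled_variance = mean([variance(p) for p in plans])   # (s_pooled)^2
f_ratio = n * var_of_means / pooled_variance

df_num = k - 1
df_den = k * (n - 1)
# Right-tail area; this closed form is valid ONLY because df_num == 2
p_value = (1 + df_num * f_ratio / df_den) ** (-df_den / 2)

print("means:", [round(m, 2) for m in group_means])
print("standard deviations:", [round(s, 4) for s in group_sds])
print(f"F = {f_ratio:.2f}, p-value = {p_value:.2f}")
```

The sketch keeps the unrounded means, so its intermediate values differ slightly from the rounded ones in Table 14.3. The F ratio still rounds to 0.47, and the p-value is far above the 1% level of significance.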

Solution to Exercise 14.2 (p. 44) 

Let $\sigma_A^2$ and $\sigma_B^2$ be the population variances for machine A and machine B, respectively.

• $H_0: \sigma_A^2 = \sigma_B^2$

• $H_a: \sigma_A^2 > \sigma_B^2$

• $n_A = 9$

• $n_B = 11$

• $df_{numerator} = 9 - 1 = 8$

• $df_{denominator} = 11 - 1 = 10$

The distribution for the hypothesis test is $F_{8,10}$.

If you are using a TI-83/84 calculator, use the function 2-SampFTest for the test.

Using the formulas, the test statistic is

$$F = \frac{(s_A)^2}{(s_B)^2} = \frac{150}{45} = 3.33$$

Since $\alpha$ > p-value, reject the null hypothesis.

At the 5% level of significance, there is sufficient evidence to conclude that the variance of the box is
greater than the variance of the lid, so the box and lid may not fit each other correctly.
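For classes without a TI-83/84, the test of two variances can be reproduced with a short Python sketch. The helper names `f_pdf` and `f_right_tail` are hypothetical. The p-value is approximated by integrating the F density numerically with Simpson's rule, a stand-in for the calculator's built-in test rather than the method the guide uses. The sketch assumes the numerator has at least 2 degrees of freedom so that the density is finite at zero.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 degrees of freedom."""
    beta = math.gamma(d1 / 2) * math.gamma(d2 / 2) / math.gamma((d1 + d2) / 2)
    return ((d1 / d2) ** (d1 / 2) * x ** (d1 / 2 - 1)
            * (1 + d1 * x / d2) ** (-(d1 + d2) / 2) / beta)

def f_right_tail(f_stat, d1, d2, steps=2000):
    """Approximate P(F > f_stat) as 1 minus a Simpson's-rule integral on [0, f_stat].

    steps must be even; assumes d1 >= 2.
    """
    h = f_stat / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f_stat, d1, d2)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2      # Simpson weights alternate 4, 2, 4, ...
        total += weight * f_pdf(i * h, d1, d2)
    return 1 - total * h / 3

# Data from Table 14.2
n_a, var_a = 9, 150      # Machine A (box)
n_b, var_b = 11, 45      # Machine B (lid)

f_stat = var_a / var_b
p_value = f_right_tail(f_stat, n_a - 1, n_b - 1)
reject = p_value < 0.05  # right-tailed test at the 5% level

print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The larger sample variance goes in the numerator because the alternative hypothesis says the box variance is the greater one, which makes the test right-tailed.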



Index of Keywords and Terms

Keywords are listed by the section with that keyword (page numbers are in parentheses). Keywords 
do not necessarily appear in the text of the page. They are merely associated with that section. Ex. 
apples, § 1.1 (1) Terms are referenced by the page they appear on. Ex. apples, 1 



B binomial, § 5(11) 

C Collaborative, § 1(1) 
continuous, § 6(15) 
Course, § 1(1) 

D descriptive, § 3(5) 
discrete, § 5(11) 
distribution, § 5(11), § 6(15) 

E Elementary, § 1(1), § 2(3), § 3(5), § 4(7), § 5(11), 
§ 6(15), § 7(21), § 8(25), § 9(27), § 10(31), 
§ 11(33), § 12(35), § 13(41), § 14(43) 
exponential, § 6(15) 

F function, § 5(11), § 6(15) 

G geometric, § 5(11) 



Guide, § 1(1), § 3(5), § 4(7), § 5(11), § 6(15) 

H hypergeometric, § 5(11) 

P Plan, § 1(1) 
Poisson, § 5(11) 
probability, § 4(7), § 5(11), § 6(15) 

R random, § 5(11), § 6(15)

S Statistics, § 1(1), § 2(3), § 3(5), § 4(7), § 5(11), 
§ 6(15), § 7(21), § 8(25), § 9(27), § 10(31), 
§ 11(33), § 12(35), § 13(41), § 14(43)

T Teacher, § 1(1), § 3(5), § 4(7), § 5(11), § 6(15) 

U uniform, § 6(15) 

V variable, § 5(11), § 6(15) 